Overview

Dataset statistics

Number of variables: 9
Number of observations: 1163
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 81.9 KiB
Average record size in memory: 72.1 B
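
The two memory figures above are consistent with each other, and a quick arithmetic check makes that visible. The sketch below assumes that all nine columns are stored as 8-byte floats and that the index adds a fixed 128 bytes; neither assumption is stated in the report, so treat the constants as hypothetical.

```python
# Back-of-the-envelope check of the memory figures in the dataset statistics.
# Assumptions (not stated in the report): every column is float64 (8 bytes)
# and the index costs a fixed 128 bytes.
N_ROWS = 1163
N_COLS = 9
BYTES_PER_FLOAT64 = 8
ASSUMED_INDEX_BYTES = 128  # hypothetical constant

total_bytes = N_ROWS * N_COLS * BYTES_PER_FLOAT64 + ASSUMED_INDEX_BYTES
total_kib = total_bytes / 1024          # report: 81.9 KiB
avg_record_bytes = total_bytes / N_ROWS  # report: 72.1 B

print(f"total size: {total_kib:.1f} KiB")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the average record size is simply the total size divided by the number of observations.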

Variable types

NUM: 9

Reproduction

Analysis started: 2020-12-20 14:06:17.743727
Analysis finished: 2020-12-20 14:06:37.472952
Version: pandas-profiling v2.6.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

P7 is highly skewed (γ1 = 33.47667454)
P9 is highly skewed (γ1 = 20.71477787)
P3 has 56 (4.8%) zeros
P4 has 57 (4.9%) zeros
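
The skewness and zero-count alerts listed above can be approximated with a few lines of plain Python. This is a minimal sketch, not the profiler's own code: the function names, the skewness threshold of 20, and the 1% zero-share threshold are all assumptions, chosen only because they are consistent with which columns are flagged in this report.

```python
import math

def sample_skewness(values):
    """Bias-adjusted sample skewness (adjusted Fisher-Pearson coefficient)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant column: treat as unskewed
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5

def column_alerts(name, values, skew_threshold=20.0, zeros_threshold=0.01):
    """Return alert strings for one numeric column (both thresholds are hypothetical)."""
    alerts = []
    skew = sample_skewness(values)
    if abs(skew) > skew_threshold:
        alerts.append(f"{name} is highly skewed (γ1 = {skew:.4f})")
    n_zeros = sum(1 for v in values if v == 0)
    zero_share = n_zeros / len(values)
    if zero_share > zeros_threshold:
        alerts.append(f"{name} has {n_zeros} ({zero_share:.1%}) zeros")
    return alerts

# Toy data (not from the profiled dataset): symmetric, with two exact zeros.
toy = [0.0, 0.0, 0.5, -0.5, 1.0, -1.0]
print(column_alerts("toy", toy))
```

Because the toy column is symmetric, only the zeros alert is produced for it.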

Variables

P1
Real number (ℝ)

Distinct count: 985
Unique (%): 84.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9595240027
Minimum: -0.312219463
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.2 KiB

Quantile statistics

Minimum: -0.312219463
5-th percentile: 0.7970839442
Q1: 0.9830515365
Median: 0.999315193
Q3: 0.999993262
95-th percentile: 1
Maximum: 1
Range: 1.312219463
Interquartile range (IQR): 0.0169417255

Descriptive statistics

Standard deviation: 0.1278252788
Coefficient of variation (CV): 0.1332173854
Kurtosis: 36.17709673
Mean: 0.9595240027
Median Absolute Deviation (MAD): 0.05980684393
Skewness: -5.51351804
Sum: 1115.926415
Variance: 0.01633930191
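
Several of the P1 statistics above are simple functions of the others, which gives a cheap consistency check. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported values; the variable names are illustrative, and the inputs are copied from the report rather than recomputed from the raw data.

```python
import math

# Values copied from the P1 statistics in this report.
minimum, maximum = -0.312219463, 1.0
q1, q3 = 0.9830515365, 0.999993262
mean, std = 0.9595240027, 0.1278252788
n_observations = 1163

value_range = maximum - minimum   # report: 1.312219463
iqr = q3 - q1                     # report: 0.0169417255
cv = std / mean                   # report: 0.1332173854
variance = std ** 2               # report: 0.01633930191
total = mean * n_observations     # report: 1115.926415

for label, value in [("range", value_range), ("IQR", iqr), ("CV", cv),
                     ("variance", variance), ("sum", total)]:
    print(f"{label}: {value:.10g}")
```

The same relations should hold for the other eight variables, up to rounding of the displayed digits.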
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.31221946 0.59576948 0.79787597 0.91776083 0.96761804 ... 0.99999906 0.99999996 1. 1. 1. ], "bayesian blocks" binning strategy used)

Common values

Value                   Count   Frequency (%)
1                       165     14.2%
0.999999999             6       0.5%
0.999999998             5       0.4%
0.999999517             2       0.2%
0.999999995             2       0.2%
0.999397152             2       0.2%
0.999999986             2       0.2%
0.999999997             2       0.2%
0.935332189             1       0.1%
0.99574578              1       0.1%
Other values (975)      975     83.8%

Minimum 5 values

Value                   Count   Frequency (%)
-0.312219463            1       0.1%
-0.166308476            1       0.1%
-0.133945433            1       0.1%
-0.122684763            1       0.1%
-0.009317187            1       0.1%

Maximum 5 values

Value                   Count   Frequency (%)
1                       165     14.2%
0.999999999             6       0.5%
0.999999998             5       0.4%
0.999999997             2       0.2%
0.999999996             1       0.1%

P2
Real number (ℝ)

Distinct count: 1092
Unique (%): 93.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8561658259
Minimum: -0.531958013
Maximum: 1
Zeros: 1
Zeros (%): 0.1%
Memory size: 9.2 KiB

Quantile statistics

Minimum: -0.531958013
5-th percentile: 0.2810347777
Q1: 0.8406747645
Median: 0.969780138
Q3: 0.99657805
95-th percentile: 1
Maximum: 1
Range: 1.531958013
Interquartile range (IQR): 0.1559032855

Descriptive statistics

Standard deviation: 0.244010227
Coefficient of variation (CV): 0.285003465
Kurtosis: 5.89665047
Mean: 0.8561658259
Median Absolute Deviation (MAD): 0.1693315585
Skewness: -2.3976505
Sum: 995.7208556
Variance: 0.05954099086
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.53195801 -0.00300441 0.393161 0.71103978 0.83790529 ... 0.99873578 0.99978726 0.99999676 0.99999998 1. ], "bayesian blocks" binning strategy used)

Common values

Value                   Count   Frequency (%)
1                       72      6.2%
0.583596373             1       0.1%
0.6754244               1       0.1%
0.920282853             1       0.1%
0.87016915              1       0.1%
0.971404163             1       0.1%
0.999058744             1       0.1%
0.968609748             1       0.1%
0.995743472             1       0.1%
0.842575984             1       0.1%
Other values (1082)     1082    93.0%

Minimum 5 values

Value                   Count   Frequency (%)
-0.531958013            1       0.1%
-0.314414503            1       0.1%
-0.291742386            1       0.1%
-0.280000339            1       0.1%
-0.24824377             1       0.1%

Maximum 5 values

Value                   Count   Frequency (%)
1                       72      6.2%
0.999999957             1       0.1%
0.999999861             1       0.1%
0.999999585             1       0.1%
0.99999927              1       0.1%

P3
Real number (ℝ)

ZEROS
Distinct count: 1108
Unique (%): 95.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2161109005
Minimum: -0.818128394
Maximum: 1
Zeros: 56
Zeros (%): 4.8%
Memory size: 9.2 KiB

Quantile statistics

Minimum: -0.818128394
5-th percentile: -0.4198713207
Q1: -0.0982188595
Median: 0.175910409
Q3: 0.5643935345
95-th percentile: 0.869927032
Maximum: 1
Range: 1.818128394
Interquartile range (IQR): 0.662612394

Descriptive statistics

Standard deviation: 0.4084255292
Coefficient of variation (CV): 1.889888609
Kurtosis: -0.9339333351
Mean: 0.2161109005
Median Absolute Deviation (MAD): 0.3497818914
Skewness: 0.03017237361
Sum: 251.3369773
Variance: 0.1668114129
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-8.18128394e-01 -5.16893606e-01 -2.64677795e-01 -5.06667500e-04 9.16040500e-04 8.54898466e-01 1.00000000e+00], "bayesian blocks" binning strategy used)

Common values

Value                   Count   Frequency (%)
0                       56      4.8%
0.433580298             1       0.1%
0.005992013             1       0.1%
-0.166124956            1       0.1%
-0.467118981            1       0.1%
0.570724625             1       0.1%
-0.743152886            1       0.1%
-0.495672031            1       0.1%
0.847387914             1       0.1%
0.587378552             1       0.1%
Other values (1098)     1098    94.4%

Minimum 5 values

Value                   Count   Frequency (%)
-0.818128394            1       0.1%
-0.793428438            1       0.1%
-0.743152886            1       0.1%
-0.732418095            1       0.1%
-0.726504113            1       0.1%

Maximum 5 values

Value                   Count   Frequency (%)
1                       1       0.1%
0.999397198             1       0.1%
0.99714357              1       0.1%
0.992045251             1       0.1%
0.988913842             1       0.1%

P4
Real number (ℝ)

ZEROS
Distinct count: 1107
Unique (%): 95.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1491940316
Minimum: -0.83915785
Maximum: 0.979015216
Zeros: 57
Zeros (%): 4.9%
Memory size: 9.2 KiB

Quantile statistics

Minimum: -0.83915785
5-th percentile: -0.4522276887
Q1: -0.1443266505
Median: 0.097104705
Q3: 0.4579815565
95-th percentile: 0.8095228938
Maximum: 0.979015216
Range: 1.818173066
Interquartile range (IQR): 0.602308207

Descriptive statistics

Standard deviation: 0.3880719904
Coefficient of variation (CV): 2.601122755
Kurtosis: -0.7345647834
Mean: 0.1491940316
Median Absolute Deviation (MAD): 0.3253962127
Skewness: 0.0906515688
Sum: 173.5126587
Variance: 0.1505998697
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-8.39157850e-01 -5.13500499e-01 -2.65090613e-01 -2.96563000e-04 2.08311500e-04 6.22406551e-01 8.97550725e-01 9.79015216e-01], "bayesian blocks" binning strategy used)

Common values

Value                   Count   Frequency (%)
0                       57      4.9%
0.268198803             1       0.1%
-0.233215034            1       0.1%
0.450293131             1       0.1%
-0.250968088            1       0.1%
0.323311335             1       0.1%
-0.001275444            1       0.1%
-0.193477634            1       0.1%
-0.078511907            1       0.1%
0.427280059             1       0.1%
Other values (1097)     1097    94.3%

Minimum 5 values

Value                   Count   Frequency (%)
-0.83915785             1       0.1%
-0.821092412            1       0.1%
-0.810818562            1       0.1%
-0.792628348            1       0.1%
-0.777157934            1       0.1%

Maximum 5 values

Value                   Count   Frequency (%)
0.979015216             1       0.1%
0.970962506             1       0.1%
0.967362541             1       0.1%
0.958902237             1       0.1%
0.948303812             1       0.1%

P5
Real number (ℝ)

UNIQUE
Distinct count: 1163
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6003686052
Minimum: -0.68273404
Maximum: 0.999998999
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.2 KiB

Quantile statistics

Minimum: -0.68273404
5-th percentile: -0.0669913822
Q1: 0.384498611
Median: 0.693157154
Q3: 0.87393708
95-th percentile: 0.9859021497
Maximum: 0.999998999
Range: 1.682733039
Interquartile range (IQR): 0.489438469

Descriptive statistics

Standard deviation: 0.3346911039
Coefficient of variation (CV): 0.5574760256
Kurtosis: 0.312743063
Mean: 0.6003686052
Median Absolute Deviation (MAD): 0.2744050866
Skewness: -0.9336641343
Sum: 698.2286878
Variance: 0.1120181351
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.68273404 -0.28293661 0.05460237 0.26245205 0.68678023 0.94660236 0.99802858 0.999999 ], "bayesian blocks" binning strategy used)

Common values

Value                   Count   Frequency (%)
0.303700424             1       0.1%
0.458011471             1       0.1%
0.217011244             1       0.1%
0.864190541             1       0.1%
0.996746606             1       0.1%
0.633403479             1       0.1%
0.974599729             1       0.1%
0.907571918             1       0.1%
0.988915121             1       0.1%
0.364442719             1       0.1%
Other values (1153)     1153    99.1%

Minimum 5 values

Value                   Count   Frequency (%)
-0.68273404             1       0.1%
-0.626728179            1       0.1%
-0.574856356            1       0.1%
-0.511736404            1       0.1%
-0.4800747              1       0.1%

Maximum 5 values

Value                   Count   Frequency (%)
0.999998999             1       0.1%
0.99994973              1       0.1%
0.999867966             1       0.1%
0.999844953             1       0.1%
0.999637227             1       0.1%

P6
Real number (ℝ)

Distinct count: 1157
Unique (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5248452312
Minimum: -0.859528505
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.2 KiB

Quantile statistics

Minimum: -0.859528505
5-th percentile: -0.2091459033
Q1: 0.2437151465
Median: 0.594119555
Q3: 0.8680788495
95-th percentile: 0.9956006918
Maximum: 1
Range: 1.859528505
Interquartile range (IQR): 0.624363703

Descriptive statistics

Standard deviation: 0.3865651475
Coefficient of variation (CV): 0.7365316944
Kurtosis: -0.4922840912
Mean: 0.5248452312
Median Absolute Deviation (MAD): 0.3262234297
Skewness: -0.6230076373
Sum: 610.3950039
Variance: 0.1494326132
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.8595285 -0.50582192 -0.26328153 0.00437773 0.34891911 ... 0.96657649 0.98606609 0.99873226 0.99999998 1. ], "bayesian blocks" binning strategy used)

Common values

Value                   Count   Frequency (%)
1                       7       0.6%
-0.387734855            1       0.1%
0.991244238             1       0.1%
0.999754995             1       0.1%
0.672077403             1       0.1%
0.875194235             1       0.1%
-0.138491348            1       0.1%
0.256862182             1       0.1%
-0.002255474            1       0.1%
0.546864316             1       0.1%
Other values (1147)     1147    98.6%

Minimum 5 values

Value                   Count   Frequency (%)
-0.859528505            1       0.1%
-0.714135509            1       0.1%
-0.641931242            1       0.1%
-0.52004809             1       0.1%
-0.491595743            1       0.1%

Maximum 5 values

Value                   Count   Frequency (%)
1                       7       0.6%
0.999999961             1       0.1%
0.999991993             1       0.1%
0.999991252             1       0.1%
0.999985324             1       0.1%

P7
Real number (ℝ)

SKEWED
Distinct count: 1161
Unique (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0009884078831
Minimum: -0.0353133
Maximum: 34.36719456
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.2 KiB

Quantile statistics

Minimum: -0.0353133
5-th percentile: -0.0337253732
Q1: -0.0328484405
Median: -0.032548002
Q3: -0.031953721
95-th percentile: -0.0293531277
Maximum: 34.36719456
Range: 34.40250786
Interquartile range (IQR): 0.0008947195

Descriptive statistics

Standard deviation: 1.015356315
Coefficient of variation (CV): 1027.264485
Kurtosis: 1132.434593
Mean: 0.0009884078831
Median Absolute Deviation (MAD): 0.06595726222
Skewness: 33.47667454
Sum: 1.149518368
Variance: 1.030948447
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-3.53133000e-02 -3.41203185e-02 -3.32155560e-02 -3.29306505e-02 -3.27766605e-02 ... -2.90472840e-02 -2.78040440e-02 -2.08401355e-02 1.68666090e-02 3.43671946e+01], "bayesian blocks" binning strategy used)

Common values

Value                   Count   Frequency (%)
-0.032731159            2       0.2%
-0.032451159            2       0.2%
-0.032900568            1       0.1%
-0.034532881            1       0.1%
-0.032877649            1       0.1%
-0.032550813            1       0.1%
-0.0353133              1       0.1%
-0.033468589            1       0.1%
-0.031321789            1       0.1%
-0.032593534            1       0.1%
Other values (1151)     1151    99.0%

Minimum 5 values

Value                   Count   Frequency (%)
-0.0353133              1       0.1%
-0.035115316            1       0.1%
-0.035102945            1       0.1%
-0.035011336            1       0.1%
-0.034942826            1       0.1%

Maximum 5 values

Value                   Count   Frequency (%)
34.36719456             1       0.1%
3.957173837             1       0.1%
0.022531708             1       0.1%
0.01120151              1       0.1%
-0.006127962            1       0.1%

P8
Real number (ℝ)

Distinct count: 1158
Unique (%): 99.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.002699656731
Minimum: -1.066400989
Maximum: 2.177948117
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.2 KiB

Quantile statistics

Minimum: -1.066400989
5-th percentile: -1.052579491
Q1: -0.8884636035
Median: -0.345308732
Q3: 0.7029833895
95-th percentile: 1.987846283
Maximum: 2.177948117
Range: 3.244349106
Interquartile range (IQR): 1.591446993

Descriptive statistics

Standard deviation: 0.9968683947
Coefficient of variation (CV): -369.2574627
Kurtosis: -0.6787063579
Mean: -0.002699656731
Median Absolute Deviation (MAD): 0.8508799522
Skewness: 0.770945336
Sum: -3.139700778
Variance: 0.9937465963
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-1.06640099 -1.06265361 -1.00926593 -0.88479891 -0.25291386 0.80339605 2.12440416 2.17793931 2.17794812], "bayesian blocks" binning strategy used)

Common values

Value                   Count   Frequency (%)
2.177948117             6       0.5%
-0.943480309            1       0.1%
-1.010018329            1       0.1%
1.910211434             1       0.1%
1.419544597             1       0.1%
-0.709115296            1       0.1%
-0.796709507            1       0.1%
0.405615593             1       0.1%
0.788053006             1       0.1%
-0.521169406            1       0.1%
Other values (1148)     1148    98.7%

Minimum 5 values

Value                   Count   Frequency (%)
-1.066400989            1       0.1%
-1.066298572            1       0.1%
-1.066248334            1       0.1%
-1.066172838            1       0.1%
-1.066153211            1       0.1%

Maximum 5 values

Value                   Count   Frequency (%)
2.177948117             6       0.5%
2.177930513             1       0.1%
2.177785659             1       0.1%
2.174331346             1       0.1%
2.173316021             1       0.1%

P9
Real number (ℝ)

SKEWED
UNIQUE
Distinct count: 1163
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.02701195877
Minimum: -0.054447739
Maximum: 7.486295884
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.2 KiB

Quantile statistics

Minimum: -0.054447739
5-th percentile: -0.0543631249
Q1: -0.05424374
Median: -0.053821459
Q3: -0.051892963
95-th percentile: -0.0196290455
Maximum: 7.486295884
Range: 7.540743623
Interquartile range (IQR): 0.002350777

Descriptive statistics

Standard deviation: 0.2868273153
Coefficient of variation (CV): -10.61853077
Kurtosis: 487.2921749
Mean: -0.02701195877
Median Absolute Deviation (MAD): 0.04770667544
Skewness: 20.71477787
Sum: -31.41490805
Variance: 0.0822699088
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.05444774 -0.05438816 -0.05419332 -0.05384821 -0.05333875 ... -0.03659348 -0.01855106 0.13138823 0.61931158 7.48629588], "bayesian blocks" binning strategy used)

Common values

Value                   Count   Frequency (%)
-0.054201659            1       0.1%
-0.05439452             1       0.1%
-0.052477972            1       0.1%
-0.054360879            1       0.1%
-0.053428821            1       0.1%
-0.054287977            1       0.1%
-0.054315462            1       0.1%
-0.054292927            1       0.1%
-0.049517502            1       0.1%
-0.054350018            1       0.1%
Other values (1153)     1153    99.1%

Minimum 5 values

Value                   Count   Frequency (%)
-0.054447739            1       0.1%
-0.054434554            1       0.1%
-0.054421333            1       0.1%
-0.054420913            1       0.1%
-0.054402215            1       0.1%

Maximum 5 values

Value                   Count   Frequency (%)
7.486295884             1       0.1%
4.959006766             1       0.1%
2.084434133             1       0.1%
1.496449212             1       0.1%
1.340106645             1       0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a perfectly linear relationship the steepness of the line does not affect the magnitude of r; only the sign of the slope determines whether r is +1 or -1.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
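
The definition above translates almost directly into code. The following is a minimal sketch in plain Python, assuming two equal-length numeric sequences; the function name `pearson_r` and the toy data are illustrative and are not taken from the report or its tooling.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cov / (std_x * std_y)

xs = [1.0, 2.0, 3.0, 4.0]
steep = [10 * x + 3 for x in xs]     # steep increasing line
shallow = [0.1 * x - 5 for x in xs]  # shallow increasing line
falling = [-2 * x for x in xs]       # decreasing line

# Both increasing lines give r of approximately +1 regardless of slope;
# the decreasing line gives approximately -1.
print(pearson_r(xs, steep), pearson_r(xs, shallow), pearson_r(xs, falling))
```

The same normalisation (dividing by n) is used for the covariance and both standard deviations, so the factor cancels and the result does not depend on that choice.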

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
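
As a rough illustration of the rank-based calculation described above, the sketch below converts each variable to ranks and then applies the covariance-over-standard-deviations formula to those ranks. It assumes tied values receive the average of their rank positions, which is a common convention but an assumption here; the helper names are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def correlation(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Spearman's rho: the correlation of the rank variables."""
    return correlation(average_ranks(xs), average_ranks(ys))

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
cubed = [x ** 3 for x in xs]  # monotonic but clearly nonlinear

print(spearman_rho(xs, cubed))  # approximately 1
print(correlation(xs, cubed))   # noticeably below 1
```

The toy example shows the point made in the text: a nonlinear but monotonic relationship yields ρ of about 1 while the linear coefficient stays below 1.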

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

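To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. A pair of observations is concordant when both variables order the two observations the same way, and discordant when they order them in opposite ways. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.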
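
The pair-counting procedure above can be sketched as follows. This is a deliberately simple O(n²) version of the formula exactly as stated (often called τ-a), in which tied pairs count as neither concordant nor discordant; library implementations commonly use a tie-adjusted variant instead, so results may differ when ties are present. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant - discordant) / total pairs, as described in the text (tau-a)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = 0
    discordant = 0
    n_pairs = 0
    for i, j in combinations(range(len(xs)), 2):
        n_pairs += 1
        # Positive product: both variables order the pair the same way.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in xs or ys; it contributes to neither count.
    return (concordant - discordant) / n_pairs

print(kendall_tau([1, 2, 3], [1, 2, 3]))  # all pairs concordant
print(kendall_tau([1, 2, 3], [3, 2, 1]))  # all pairs discordant
print(kendall_tau([1, 2, 3], [1, 3, 2]))  # two concordant, one discordant
```

For the 1163 observations in this dataset the quadratic loop would visit roughly 676,000 pairs, which is still fast enough for an illustration.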

Missing values

Sample

First rows

             P1         P2         P3         P4         P5         P6         P7         P8         P9
0      0.595378  -0.531958   0.679654  -0.126799   0.432046   0.988092  -0.029813   0.768742  -0.054167
1      0.982237   0.991481   0.337646   0.228144   0.920032   0.999985  -0.032259   2.161651  -0.054350
2      0.996162   0.893987   0.767413   0.606840   0.970808   0.882602  -0.032267   0.369607  -0.054157
3      0.999928   0.922748  -0.444438  -0.371287   0.528038  -0.221645  -0.032692  -1.065439   0.381914
4      0.985838   0.937512   0.699592   0.585263   0.838804   0.999602  -0.033713   2.089720  -0.054379
5      0.994266   0.917955   0.441421   0.588541   0.714376   0.997314  -0.030095   2.081266  -0.054257
6      0.999478   0.989321   0.654139   0.705177   0.350525   0.877974  -0.030984   0.605393  -0.054116
7      0.999646   0.275265  -0.333569  -0.441022   0.576963   0.607380  -0.032129   1.039586  -0.054264
8      0.825371  -0.248244   0.220238   0.128478   0.364443   0.951265  -0.030247   1.512327  -0.054221
9      0.999983   0.629482   0.034663  -0.079224   0.941435   0.878837  -0.032445   0.781914  -0.054252

Last rows

             P1         P2         P3         P4         P5         P6         P7         P8         P9
1153   1.000000   0.905699   0.000000   0.000000   0.265684   0.872164  -0.032769   1.947519  -0.054363
1154   1.000000   0.964776  -0.414177  -0.401836   0.935540   0.687023  -0.033069  -0.591688  -0.053674
1155   0.999973   0.875812  -0.141531  -0.203786  -0.026399   0.739077  -0.034112  -0.384740  -0.054110
1156   0.989177   0.925273  -0.327243  -0.366597   0.329865   0.320186  -0.032896  -0.578614  -0.053668
1157   0.999718   0.525421   0.089664  -0.324932   0.895527  -0.133320  -0.032415  -0.992041  -0.048131
1158   1.000000   0.670847   0.566994   0.325485   0.263078   0.211872  -0.030984  -1.053454  -0.004980
1159   0.999503   0.994784   0.847388   0.807294   0.992071   0.779629  -0.032152  -0.867659  -0.051979
1160   0.999999   0.751775  -0.209639  -0.052031   0.818428   0.850427  -0.032923  -0.517706  -0.053754
1161   0.999992   0.997277   0.199312   0.190436   0.534354   0.119579  -0.032986  -1.031108  -0.043029
1162   0.999972   0.561715  -0.152851  -0.257109   0.988001   0.141271  -0.032657  -1.042708  -0.035768